System and method for providing unsupervised learning to associate profiles in video audiences

ABSTRACT

A system and method for providing unsupervised learning to associate profiles in video audiences is provided. The method includes: receiving a zapping log and a broadcast schedule, wherein the zapping log includes records of set top box zapping signatures for at least a portion of the set top boxes of the network; deriving set top box signatures from the zapping log and broadcast schedule; clustering viewer profiles into groups of viewer profiles using the set top box signatures; and associating at least one set top box within the network with at least one viewer profile, wherein the method of performing unsupervised learning does not use data associating demographic or psychographic profiles to the at least a portion of the set top boxes of the network for which the zapping log contains records of set top box zapping signatures.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims priority to copending U.S. Provisional Application entitled, “SYSTEM AND METHOD FOR PROVIDING PERSONAL ADVERTISEMENTS FOR AN ACCESS NETWORK,” having Ser. No. 60/956,728, filed Aug. 20, 2007, which is entirely incorporated herein by reference.

FIELD OF THE INVENTION

The present invention relates to advertising, and more particularly is related to providing personal advertisement to video services.

BACKGROUND OF THE INVENTION

Owners of products and services, also referred to herein as advertisers, spend significant funds advertising on television. In addition, advertisers seek to maximize return from their investment in advertising on television by using different techniques. As an example, owners may pay to have an advertisement run at a specific time on a specific channel. Such an advertisement may not only be for products and services, but for any content, such as, but not limited to, video on demand, gaming, and any other content or service. In addition, owners may pay a premium price to have their advertisement run during the showing of popular television programming.

Unfortunately, advertisers do not have control over who may be watching television at a time that an advertisement is run. As a result, funds associated with television advertising are not maximized. Instead, after receiving ratings associated with an aired television show, advertisers pay based upon a previously desired audience and an agreed upon percentage. Funds would be better allocated if a larger number of a specific desired audience could be selected for viewing of targeted advertisements.

Different techniques have been used in an attempt to maximize television advertising investments. Examples of known techniques include attempting to obtain demographic and psychographic profiles, and using information about rating. Unfortunately, information about rating, demographic and psychographic profiles, and targeted rating is obtained using surveys and/or people meters, which are based on small sample audiences and are inaccurate in the collection process. Advertisers, network management, and cable/satellite decision makers would like to use more accurate information for placement and pricing of television advertisements.

Currently, the process of creating television viewer profiles has not made use of the actual actions of the television viewers while watching television. Utilizing information associated with viewer actions while watching television would be very useful in creating television viewer profiles.

Thus, a heretofore unaddressed need exists in the industry to address the aforementioned deficiencies and inadequacies.

SUMMARY OF THE INVENTION

Embodiments of the present invention provide a system and method for providing unsupervised learning to associate profiles in video audiences. Briefly described, in architecture, one embodiment of the system, among others, can be implemented as follows. The system contains a head end having a computer and means for communicating therein, wherein the computer has a management application stored therein, and wherein the management application further comprises: logic configured to receive a zapping log and a broadcast schedule, wherein the zapping log includes records of set top box zapping signatures for at least a portion of the set top boxes of the network; logic configured to derive set top box signatures from the zapping log and broadcast schedule; logic configured to cluster viewer profiles into groups of viewer profiles using the set top box signatures; and logic configured to associate at least one set top box within the network with at least one viewer profile, wherein performing unsupervised learning does not use data associating demographic or psychographic profiles to the at least a portion of the set top boxes of the network for which the zapping log contains records of set top box zapping signatures.

The present invention can also be viewed as providing methods for providing unsupervised learning to associate profiles in video audiences. In this regard, one embodiment of such a method, among others, can be broadly summarized by the following steps: receiving a zapping log and a broadcast schedule, wherein the zapping log includes records of set top box zapping signatures for at least a portion of the set top boxes of the network; deriving set top box signatures from the zapping log and broadcast schedule; clustering viewer profiles into groups of viewer profiles using the set top box signatures; and associating at least one set top box within the network with at least one viewer profile, wherein the method of performing unsupervised learning does not use data associating demographic or psychographic profiles to the at least a portion of the set top boxes of the network for which the zapping log contains records of set top box zapping signatures.

Other systems, methods, features, and advantages of the present invention will be or become apparent to one with skill in the art upon examination of the following drawings and detailed description. It is intended that all such additional systems, methods, features, and advantages be included within this description, be within the scope of the present invention, and be protected by the accompanying claims.

BRIEF DESCRIPTION OF THE DRAWINGS

Many aspects of the invention can be better understood with reference to the following drawings. The components in the drawings are not necessarily to scale, emphasis instead being placed upon clearly illustrating the principles of the present invention. Moreover, in the drawings, like reference numerals designate corresponding parts throughout the several views.

FIG. 1 is a schematic diagram illustrating an example of an IPTV network in which the present system may be provided.

FIG. 2 is a flow chart further illustrating the process of personalizing advertisements, in accordance with one exemplary embodiment of the invention.

FIG. 3 is a flow chart further illustrating the process of identifying and associating consumer profiles to set top boxes within a supervised learning scenario.

FIG. 4 is a schematic diagram illustrating an example of a cable network in which the present system may be provided.

FIG. 5 is a schematic diagram illustrating an example of a satellite network in which the present system may be provided.

FIG. 6 is a schematic diagram illustrating an example of a terrestrial network in which the present system may be provided.

FIG. 7 is a flow chart further illustrating the steps of the supervised learning process.

FIG. 8 is a flow chart further illustrating the process of identifying and associating consumer profiles to set top boxes within an unsupervised learning scenario.

FIG. 9 is a block diagram further illustrating functionality of the management application as blocks of logic.

FIG. 10 is a detailed logical flow diagram illustrating a sequence of events performed during unsupervised learning.

DETAILED DESCRIPTION

The present system is capable of learning the viewing habits of video viewers by collecting zapping events and other events performed by the viewer. Such videos may be viewed via a television, hand held device, computer, or any device capable of displaying video. The events may be collected at a set top box, computer, or other device. Alternatively, the events may be collected at a different location, such as, but not limited to, at an access multiplexer located in a head end, or in a device located separate from the head end. The system learns the viewing habits and zapping habits of different population profiles by identifying the viewing profile of a household.

The system uses supervised or unsupervised learning functionality for identifying different population profiles, and provides a representation of the probability (or another form of representation) of each population profile to watch any given program and to present a zapping pattern. The probabilities can be utilized as a tool for advertisers searching for the demographic profile of the audience of a television program, or, using inference functionality described herein, to identify the home audience at each household, and the specific viewers of a television program. Thereafter, the system is capable of supplying personalized content, such as, but not limited to, advertisements, video selections, and other content, to the viewers. It should be noted that the following description provides an example in which the content is an advertisement, however, the invention is not intended to be limited to advertisements, but instead, any content that may be personalized.

The present system collects the operations performed by viewers at service decoders, such as, but not limited to, set top boxes (the term set top box is used hereafter). The system then employs unsupervised or supervised learning functionality, as described herein, to interpret the operations at each set top box as the sum of operations of all viewers associated with this set top box. The system learns to identify different viewer profiles in the population and associates with each set top box and profile a probabilistic model of the viewing and zapping habits of viewers.

It should be noted that the present system and method may be provided within different infrastructures. As an example, the following description provides examples of using the present system and method in an Internet protocol television (IPTV) infrastructure, in a cable infrastructure, and in a satellite infrastructure. While these infrastructures are described herein, the present system and method is not intended to be limited to these infrastructures.

Before describing the present system and method in detail, it is beneficial to provide certain definitions.

Set top box (STB) or service decoder: A set top box or service decoder is a device responsible for converting digital (or analog) content received into viewable content that may be fed into a television set or other monitor. The set top box or service decoder may be located at a household or another location.

Platform: A network of service decoders (e.g., set top boxes) of a specific television service provider.

Passive audience identification: Identification of the viewer's profiles without any specific actions performed by the viewer.

Zapping event: A zapping event is an event where there is switching from a current service to another service, where the switching is performed by, for example, but not limited to, use of a remote control, pushing buttons on the set top box, or any action that causes switching, including, but not limited to, voice commands, or even consumer motions without pressing buttons. In addition, a zapping event may be other means for communicating with a set top box, such as, but not limited to, pressing an electronic program guide, pressing a volume button, and other actions involving the set top box.

Zapping pattern: A zapping pattern is the behavior of a viewing individual in terms of zapping, such as, but not limited to, programs watched, frequency of zapping events, and variance of zapping frequency.

Set top box (STB) zapping signature: A set of zapping events of a particular STB.

Zapping log: Records of the STB zapping signatures for an entire STB network (Platform) or for part of the network.

Channel: A stream of programs broadcast consecutively from a content source.

Program: Content that was broadcast on a specific channel at a specific date and time, whether on demand or generally broadcast.

Program rating: Percent of viewers that watched the program.

Targeted program rating: Percent of viewers of specific profile that watched the program.

Channel rating: Percent of viewers that watched the channel during the specified time period.

Targeted channel rating: Percent of viewers of specific Profile that watched the channel during the specified time period.

Profile: The classification of an individual into one of several population groups that is targeted. Such profiles may be, for example, but not limited to, psychographic (for example, behavioral) or demographic profiles. Examples of such groups include, but are not limited to, gender, age, income, marital status, and possibly also by interests in different fields.

Learning functionality: Functionality used to reduce a large set of observed data, together with its classification into groups, to a set of parameters that allows reconstruction of the classification of the majority of the original data and classification of similar, unlearned data, or that produces a new type of classification. Different relevant learning methods may be utilized to provide the learning functionality such as, but not limited to, artificial neural networks, decision trees, k-Nearest Neighbor, Quadratic classifier, support vector machine, direct probability estimate using Bayesian inference, Bayesian networks, Gaussian estimators, least squares optimization methods, and other optimization methods.

Supervised learning: Supervised learning is learning in which the classification of the observed data is inferred from a sample of the data supplied by an outside source. The learning functionality searches for a parameter set allowing reconstruction of the classification from the input that later can be used for classification of new unlearned data.

Unsupervised learning: Unsupervised learning is learning in which no classification of observed data is given (i.e., no sample is provided), and the functionality attempts to classify the data into different classes under some constraints. The functionality may use a method, such as, but not limited to, vector quantization, and various learning methods and various optimization methods, to find a reduction of the data into representative classes.
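For exemplary purposes only, the following is a minimal sketch of vector quantization, one of the methods named in the above definition of unsupervised learning, implemented here as a basic k-means loop. The feature vectors, the function names, and the use of squared Euclidean distance are assumptions made for illustration and are not taken from the present description.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(vector, codebook):
    """Index of the code vector closest to the given vector."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(vector, codebook[i]))


def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def vector_quantize(vectors, k, iterations=20, seed=0):
    """Reduce the vectors to k representative code vectors (k-means).

    Returns (codebook, labels), where labels[i] is the index of the
    code vector assigned to vectors[i]. No class labels are supplied.
    """
    rng = random.Random(seed)
    # Hypothetical initialization: pick k distinct input vectors.
    codebook = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in vectors:
            groups[nearest(v, codebook)].append(v)
        # Keep the previous code vector when a group ends up empty.
        codebook = [mean(g) if g else codebook[i]
                    for i, g in enumerate(groups)]
    labels = [nearest(v, codebook) for v in vectors]
    return codebook, labels


# Hypothetical per-set-top-box features, e.g. share of viewing time
# spent on two illustrative kinds of content.
features = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
codebook, labels = vector_quantize(features, k=2)
```

In this sketch the first two vectors fall into one class and the last two into another, without any classification having been supplied in advance.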

FIG. 1 is a schematic diagram illustrating an example of an IPTV network 10 in which the present system may be provided. Specifically, FIG. 1 is specific to video on demand or personalized advertisements for an IPTV infrastructure. As shown by FIG. 1, an IPTV head end 20 is provided, portions of which communicate with at least one customer premises 100A-100D. As is known by those having ordinary skill in the art, a head end is the physical location in an area where a video signal is received by a provider, stored, processed, and transmitted to local customers of the provider. One having ordinary skill in the art would also appreciate that more than one head end may be provided within a network. In addition, a network may have more than one type of head end, such as, but not limited to, a cable head end, a satellite head end, an IPTV head end, and a terrestrial head end.

The head end 20 contains at least a video service splicer 30, an advertisements video server 40, a management application 50, and an access network multiplexer 60. One having ordinary skill in the art would appreciate that the head end 20 may have portions in addition to those mentioned herein. In addition, while the present description refers to a management application, it should be noted that the management application is stored on a computer.

The video service splicer 30 receives video and audio services from a satellite dish 70. It should, however, be noted that video and audio services may be received by devices other than a satellite dish 70, such as, but not limited to, a cable network or any device capable of providing video to the head end 20.

The video service splicer 30 is capable of splicing personal advertisements into a video service stream, as instructed by the management application 50 and as is further described in detail hereinbelow. The video service splicer 30 also receives advertisements from the advertisements video server 40. In addition, actions of the video service splicer 30 are controlled by the management application 50. It should be noted that, for the example of an IPTV network, the video packets received by the video service splicer 30 may carry an Internet protocol (IP) address and a User Datagram Protocol (UDP) port number. It should also be noted that the video service splicer 30 may instead receive video and audio services from a cable fiber.

The access network multiplexer 60 is responsible for routing video services to transmission units 120A-120D that are video services decoders, as explained hereinbelow. The transmission units 120 are each located within a customer premises 100A-100D. The access network multiplexer 60 is connected to both the management application 50 and the video service splicer 30. Specifically, the access network multiplexer 60 may perform, for example, IP and UDP port manipulation. It should be noted that the access network multiplexer 60 may be, for example, but not limited to, an optic multiplexer or a digital subscriber line access multiplexer (DSLAM). From a multicast point of view, as described hereinbelow, connection between the access network multiplexer 60 and a set top box 110 may be a shared media connection, or any other type of connection, and there may or may not be a multicast hierarchy between the access network multiplexer 60 and the set top box 110.

The management application 50 communicates with the video service splicer 30, the advertisements video server 40, and the access network multiplexer 60. In addition, the management application 50 provides the functionality required to learn unsupervised profiles in television audiences, as is described in detail hereinbelow. It should be noted that in accordance with an alternative embodiment of the invention, the management application 50 may instead be located within a set top box 110 located within the customer premises 100A-100D.

Each customer premises 100A-100D at least contains a set top box 110A-110D and a transmission unit 120A-120D. While for exemplary purposes four customer premises 100A-100D are illustrated, one having ordinary skill in the art would appreciate that additional or fewer customer premises 100A-100D may be provided. The transmission unit 120 is capable of receiving advertisement streams and video streams and forwarding the streams to an appropriate set top box 110. For exemplary purposes, the customer premises 100A-100D is illustrated as also containing a computer 130A-130D, although a computer 130 is not integral to the invention. It should be noted that while a single set top box is shown as being located within a customer premises 100, more than one set top box 110 may be located within the customer premises 100. In addition, in accordance with an alternative embodiment of the invention, the set top box may be a computer or any device that can decode a service. For the present example of an IPTV network, the set top box 110 receives a video service with certain TCP/IP parameters, such as, but not limited to, IP address and UDP port. It should be noted, however, that in a cable network or a satellite network, the set top box 110 may or may not receive TCP/IP parameters.

The present system enables editing of online personal video so as to provide personalized television advertisements directed toward a viewer presently watching the television. As is described in detail below, the present invention is capable of categorizing a viewer into an advertising profile, an example of which is, but is not limited to, a demographic profile. Within a single customer premises, different television viewers may have different profiles. The different television viewers may view the same television during the day. Each different viewer may be associated with a different advertising profile, such as, but not limited to, a demographic profile, thus preferably receiving different advertising messages. As an example, a family structure may be described as having an adult male of age 45, an adult female of age 42, a male teenager of age 17, a female teenager of age 14, and a male child of age 7. It should be noted that while the present description refers to a demographic profile, other types of profiles may be provided for.

During the time that a television viewer consumes service transmissions the management application 50 identifies the profile of the viewer. After identifying the profile, the application 50 performs personalized advertisements editing for that particular profile. When there is a different viewer with a different advertising profile that is using the same video decoder, the management application 50 identifies the profile that the viewer belongs to and performs online personalization editing for the advertisements, as described below.

In accordance with the present invention, for both supervised and unsupervised learning, the television consumers, also referred to herein as viewers, are not individually identifying themselves to the system. As a result, the system is required to identify consumer profiles and to associate the profiles with a specific set top box. This process is described in detail hereinbelow. Prior to describing this process, a general process of IPTV advertisement insertion in a broadcast environment is described in detail.

A typical advertisement projection works as follows. During content consumption the access network multiplexer 60 receives a video signal and sends the video signal to the customer premises 100A-100D using an IP protocol. During an advertisement break the video transmissions continue to be transmitted in multicast, thus there is no personalization of advertisements. To instead personalize advertisements, the following is performed.

FIG. 2 is a flow chart 200 further illustrating the process of personalizing advertisements, in accordance with one exemplary embodiment of the invention. Any process descriptions or blocks in flow charts should be understood as representing modules, segments, or portions of code that include one or more executable instructions for implementing specific logical functions or steps in the process, and alternative implementations are included within the scope of the embodiment of the present invention in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the present invention.

As shown by block 202, content is transmitted from the head end 20, via the access network multiplexer 60, to the set top box 110. An example of a protocol that may be used for the transmission is the Internet group management protocol (IGMP), which is used by IP hosts to manage their dynamic multicast group membership. Of course, other protocols may be used.

In accordance with the present example, a subset, or complete set, of the customers that are connected to the access network multiplexer 60 are viewing the same video and/or audio service (i.e., content). The management application 50 also continuously identifies the consumers (block 204). It should be noted that the management application 50 can utilize either online processing or offline processing to determine a relationship between viewed content (e.g., videos) and viewer profiles. Regarding offline processing to identify consumers, associate the consumers with content, and produce reports, in accordance with a predefined schedule, or when prompted to do so, the management application 50 reviews zapping patterns, processes the patterns, and associates each program viewed from a set top box 110 with a viewer profile. Alternatively, for online processing, during an advertising break, the management application 50 reviews only recent zapping events to determine which viewer is presently viewing content. Further description of consumer identification is provided with regard to FIG. 3, FIG. 8, and FIG. 10. It should be noted that the information received by the management application 50 may be received from a source other than a set top box.
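For exemplary purposes only, the online processing mentioned above, in which only recent zapping events are reviewed during an advertising break, may be sketched as a simple time-window filter. The record layout, the function name, and the fifteen minute default window are assumptions made for illustration; the present description does not specify them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ZappingEvent:
    """Hypothetical record of one zapping event."""
    stb_id: str       # set top box identifier
    timestamp: float  # seconds since an arbitrary epoch
    channel: str      # service switched to


def recent_events(events, stb_id, now, window_seconds=900.0):
    """Return the events of one set top box that fall within the
    window ending at `now`, ordered from oldest to newest."""
    selected = (
        e for e in events
        if e.stb_id == stb_id and now - window_seconds <= e.timestamp <= now
    )
    return sorted(selected, key=lambda e: e.timestamp)


log = [
    ZappingEvent("stb1", 100.0, "news"),
    ZappingEvent("stb1", 1000.0, "sports"),
    ZappingEvent("stb2", 1005.0, "kids"),
    ZappingEvent("stb1", 990.0, "movies"),
]
window = recent_events(log, "stb1", now=1010.0, window_seconds=60.0)
```

The events selected in this manner would then be passed to whichever identification functionality is in use; offline processing would instead review the complete zapping log.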

Returning to the flowchart 200 of FIG. 2, the management application 50 decides which advertisements of the advertisement set each consumer should receive (block 206). It should be noted that the process of selecting advertisements is described in detail herein.

As shown by block 208, the video service splicer 30 then splices the advertisements according to the decision of block 206. Since one having ordinary skill in the art would know how a video splicer splices advertisements, further description of the splicing process is not provided herein. As shown by block 210, when the advertisement break is over, the access network multiplexer 60 continues to transmit the multicast transmission as it did prior to the advertisement break.

It should be noted that if during an advertisement break the consumer changes the consumed video service, the management application 50 supplies the new service in the same manner. Specifically, if the service transmits content, the management application 50 continues to transmit the content with the multicast protocol. In addition, if there is an advertisement break, the management application 50 may splice different advertisements.

As previously mentioned, the present system provides a consumer specific advertising environment. This environment is provided in part by the providing of online multilayer multicast groups between the access network multiplexer 60 and the set top boxes 110A-110D. The access network multiplexer 60 transmits broadcast transmissions with multicast protocol to a subset A of the set that is connected to the access network multiplexer 60. In the subset A there are different subsets B of consumers watching the same channel at a given moment that are connected to the access network multiplexer 60. Within a single subset B, consumers are associated by their profile for advertising. When there is an advertisement break, the access network multiplexer 60 is transmitting an additional layer of multicast, where each different subset Bi is receiving different advertisements according to the advertisement profile associated with subset Bi. Finally, when the advertisement break is over, subset A consumers continue to watch the same service.
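For exemplary purposes only, one reading of the above multilayer grouping may be sketched as a two-level grouping: set top boxes are first grouped by the channel presently being watched, and then, within each channel, by advertising profile. The function name, the dictionary inputs, and the "default" fallback profile are assumptions made for illustration.

```python
from collections import defaultdict


def build_ad_groups(current_channel, ad_profile):
    """Group set top boxes by channel, then by advertising profile.

    current_channel: mapping of set top box id -> channel being watched
    ad_profile:      mapping of set top box id -> advertising profile
    Returns {channel: {profile: [sorted set top box ids]}}.
    """
    groups = defaultdict(lambda: defaultdict(set))
    for stb_id, channel in current_channel.items():
        # Hypothetical fallback for boxes with no identified profile.
        profile = ad_profile.get(stb_id, "default")
        groups[channel][profile].add(stb_id)
    return {
        channel: {profile: sorted(ids) for profile, ids in by_profile.items()}
        for channel, by_profile in groups.items()
    }


watching = {"stb1": "ch5", "stb2": "ch5", "stb3": "ch5", "stb4": "ch7"}
profiles = {"stb1": "adult", "stb2": "child", "stb3": "adult"}
ad_groups = build_ad_groups(watching, profiles)
```

In this sketch each inner group would correspond to one additional multicast layer during an advertisement break, while all set top boxes on a channel share the same multicast transmission outside of the break.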

While the above provides an example of an IPTV network 10, a different infrastructure in which the present system and method may be provided includes a cable network 400. FIG. 4 is a schematic diagram illustrating an example of a cable network 400 in which the present system may be provided. While there are similarities between the IPTV network of FIG. 1 and the cable network 400 of FIG. 4, there are also differences, which are described herein.

Referring to FIG. 4, a cable head end 410 of the cable network 400 is very similar to the IPTV head end 20 of the IPTV network 10. It should be noted, however, that instead of an access network multiplexer 60, the cable network 400 contains an RF interface 410, which may be, for example, but not limited to, a quadrature amplitude modulation (QAM) modulator and/or a radio frequency (RF) combiner. The cable network 400 provides for individual coaxial cables to provide communication capability from the cable head end 410 to individual set top boxes 430A-430H, where each set top box is located within a customer premises (CP) 440A-440H, such as, but not limited to, a home.

Another example of a network in which the present system and method may be provided is a satellite network. FIG. 5 is a schematic diagram illustrating an example of a satellite network 500 in which the present system may be provided. The satellite network 500 contains a satellite head end 510 that is similar to the IPTV head end 20, except that the satellite head end 510 contains an RF modulation interface 520. The RF modulation interface 520 is capable of formatting and amplifying received data for transmission to a satellite 550.

The satellite 550 is capable of reflecting received data to satellite dishes 560A-560N capable of receiving data signals from the satellite 550. Each satellite dish 560A-560N is associated with a customer premises 570A-570N, such as, for example, a home. In addition, each customer premises 570A-570N has at least one set top box 580A-580N located therein.

Still a further example of a network in which the present system and method may be provided is a terrestrial network. FIG. 6 is a schematic diagram illustrating an example of a terrestrial network 600 in which the present system may be provided. The terrestrial network 600 contains a terrestrial head end 610 that is similar to the IPTV head end 20, except that the terrestrial head end 610 contains an RF modulation interface 620. The RF modulation interface 620 is capable of formatting and amplifying received data for transmission to a radio tower 650.

The radio tower 650 is capable of reflecting received data to antennas 660A-660N capable of receiving data signals from the radio tower 650. Each antenna 660A-660N is associated with a customer premises 670A-670N, such as, for example, a home. In addition, each customer premises 670A-670N has at least one set top box 680A-680N located therein.

In accordance with the present invention, the management application 50 identifies the consumer profiles that are using video/audio decoders (i.e., set top boxes) in the network 10. For exemplary purposes the example of a single household having two television sets is provided. Each television is connected to a different set top box. A first television A is located in the living room and a second television B resides in a room for children.

In accordance with the present example, there are three consumer demographic profiles in the household, namely:

1. Profile 1: Male adult of age 37

2. Profile 2: Female adult of age 34

3. Profile 3: Male child of age 8 and male child of age 10

The consumer profiles are associated with the television sets as follows:

Television A: profiles 1, 2, and 3 (all the household residents are consuming content via television A).

Television B: profile 3 (only the children are using television B).

The process of identifying and associating consumer profiles to set top boxes may be separated in accordance with whether a supervised learning process is used or an unsupervised learning process. These two scenarios are described separately hereinbelow, although it will be noted that certain steps in the processes are similar.

In accordance with the present example, for both the supervised and unsupervised scenarios, service providers have no knowledge of the profiles existing in the household, the location of the television sets in the household, and/or associations between the television sets and the profiles. Instead, the management application 50 identifies and associates the consumer profiles with the set top boxes.

Supervised Learning

Reference is now made to the flowchart 300 of FIG. 3. The flowchart 300 of FIG. 3 further illustrates the process of identifying and associating consumer profiles to set top boxes 110A-110D within a supervised learning scenario. As shown by block 302, to acquire a sample, the service provider may send a questionnaire to the consumers. Alternatively, the service provider may use any other method of obtaining data, such as, but not limited to, having a telephone conversation. The questionnaire may refer to the household demographic details, video decoders (i.e., set top boxes), and association between the usage of each person in the household and the video decoders in the household. As shown by block 304, consumers fill out the questionnaire and return the same to the service provider. With the return of the consumer questionnaire, it is known which individual profiles and set top boxes are associated with a household.

As shown by block 306, set top boxes 110 in the network 10 record all of the zapping events that the consumers create. In accordance with the present description, and as is known by those having ordinary skill in the art, zapping refers to the switching from the current service to another service via use of, for example, but not limited to, a remote control or pushing buttons on the video decoder. It should be noted that the use of a remote control is provided for exemplary purposes only. Zapping may also be initiated by voice commands, or even by consumer motions without pressing buttons.

As shown by block 308, the set top boxes 110 send the zapping events to the management application 50. The management application 50 then associates behavior of consumers and their zapping pattern with the households that either did not return the questionnaire or that never received a questionnaire (block 310).

The association process is a learning process, also referred to as a business process, which is the process of passive platform audience learning and identification, and targeted platform rating calculation and analysis. The learning process is divided into multiple steps, including data collection, modeling, learning, identification, analysis, and post processing. FIG. 7 is a flow chart 700 further illustrating the steps of the supervised learning process.

Data Collection

Referring to FIG. 7 and the step of data collection, in order to perform audience learning, audience identification, and targeted rating calculation, certain external data is collected and converted into an internal format (block 702). This external data includes the zapping log, the broadcast schedule, set top box information, and sample information. The zapping log includes the actions that were performed by the set top box user using a remote control, directly using set top box control buttons, or performing a different action that caused a change from a current service to another service, or from a current state of the set top box to another state of the set top box (e.g., switching on or off). The broadcast schedule (or AsRun) includes, for example, a timetable for the platform channels/programs during the zapping gathering period. It should be noted that the broadcast schedule may also include a schedule of video on demand programs, or a schedule of any interactive service. The broadcast schedule should be reconciled with the zapping log in terms of times and channel identifications. The set top box information includes the relevant information (e.g., unique set top box identifier and address) for every set top box for which zapping was collected. The set top box information should also be reconciled with the zapping log in terms of set top box identifications.

Modeling

Modeling is the process of converting the zapping log into different data models that could be used by different learning and identification algorithms, thereby providing a set top box signature (block 704). In accordance with the present system and method, at least the following data models are recognized. A first data model that is recognized is a set top box viewing signature. Regarding the set top box viewing signature, for each set top box, the list of “watched” programs could be created based on the zapping log and reconciled broadcast schedule. For each watched program, an aggregated watching percentage is given. As an example, “STB1 watched program number 56, 30%” means that STB1 watched 30% of the program overall (including leaving the program and returning to it) during the whole broadcast of program number 56. A second data model that is recognized is a set top box time signature. The set top box time signature is, for each set top box, the list of percentages of viewing every channel during a specific time slot, aggregated per day of the week. As an example, “set top box 1 (STB1) watched CNN on Sundays between 12:00 and 13:00, 25%” means that during the learning period, the average time that this particular set top box watched CNN between 12:00 and 13:00 on Sundays was fifteen minutes.

A third data model that is recognized is a set top box zapping frequency signature. Specifically, every profile does zapping with different frequencies. Calculating zapping frequencies of every set top box during the predefined time periods provides a Zapping Frequency Signature.
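By way of illustration only, the viewing signature described above may be sketched as follows. The sketch assumes that the zapping log has already been converted into viewing intervals and that all times are expressed in minutes on a clock shared with the broadcast schedule. The function name, the tuple layouts, and the sample data are hypothetical and are not part of the described system.

```python
from collections import defaultdict

def viewing_signature(intervals, schedule):
    """Return {stb_id: {program_id: fraction of the program watched}}.

    intervals: (stb_id, channel, start, end) tuples derived from the zapping log.
    schedule:  (program_id, channel, start, end) tuples from the broadcast schedule.
    Times are minutes on a shared clock (an assumption of this sketch).
    """
    watched = defaultdict(lambda: defaultdict(float))
    for stb_id, channel, v_start, v_end in intervals:
        for program_id, p_channel, p_start, p_end in schedule:
            if channel != p_channel:
                continue
            # Overlap between the viewing interval and the program broadcast.
            overlap = min(v_end, p_end) - max(v_start, p_start)
            if overlap > 0:
                watched[stb_id][program_id] += overlap / (p_end - p_start)
    return {stb: dict(programs) for stb, programs in watched.items()}

# Hypothetical data: STB1 leaves program 56 and later returns to it.
schedule = [(56, "CNN", 0, 60)]
intervals = [("STB1", "CNN", 0, 10), ("STB1", "BBC", 10, 40), ("STB1", "CNN", 40, 48)]
signature = viewing_signature(intervals, schedule)
```

In this hypothetical data, STB1 watches 18 of the 60 minutes of program number 56, which corresponds to the 30% aggregated watching percentage used in the example above.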

Unfortunately, the zapping data is not noise free. Most of the viewers use the remote control in the same fashion, but there is a small minority of users that would use the remote control differently. This affects the general zapping frequency, surfing periods (when the viewer changes the channels with high frequency in order to find something interesting), etc. In order to handle these irregular behaviors, a set of data filters should be applied to the zapping log prior to modeling.
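The description does not specify which data filters are applied. The following is a minimal sketch of one possible filter, assuming that a zap followed by another zap within a short dwell time belongs to a surfing period and is discarded before the zapping frequency is calculated. The `min_dwell` value, the function names, and the sample timestamps are assumptions made for illustration.

```python
def drop_surfing(zap_times, min_dwell=10):
    """Keep only zaps after which the viewer stayed at least min_dwell seconds.

    zap_times: sorted zap timestamps in seconds for one set top box.
    The last zap is always kept because no later zap bounds its dwell time.
    """
    kept = []
    for i, t in enumerate(zap_times):
        is_last = i == len(zap_times) - 1
        if is_last or zap_times[i + 1] - t >= min_dwell:
            kept.append(t)
    return kept

def zapping_frequency(zap_times, period_start, period_end):
    """Zaps per hour within [period_start, period_end); times in seconds."""
    count = sum(period_start <= t < period_end for t in zap_times)
    return count / ((period_end - period_start) / 3600)

# Hypothetical zap timestamps (seconds) for one set top box during one hour.
zaps = [0, 3, 5, 8, 900, 2400, 2402, 3000]
filtered = drop_surfing(zaps)
raw_frequency = zapping_frequency(zaps, 0, 3600)
filtered_frequency = zapping_frequency(filtered, 0, 3600)
```

With this sample, the rapid bursts of zaps collapse to the channels on which the viewer actually settled, so the filtered frequency is lower than the raw frequency.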

Learning

For supervised learning, learning is a process in which the set top box signatures (viewing, time, and/or zapping frequency), created at the data modeling stage, are used with a list of set top boxes and profiles to provide an Association Rule (block 706). The Association Rule provides knowledge of how to associate a list of profiles within a network to a set top box within the network. The Association Rule is needed because filled out questionnaires are not received from all parties, so that the relationships between profiles and set top boxes outside of the sample are unknown and must be determined.

It should be noted that during supervised learning, it is not determined which profiles are associated with which set top boxes. Instead, as mentioned above, an Association Rule is determined to provide knowledge of how to associate a list of profiles to each set top box.

As mentioned above, during supervised learning there is an association of set top box signatures (e.g., viewing) for each set top box in the data model to a predefined list of profiles, based on a sample, for further use in the identification functionality. A sample is a partial list of set top boxes for which both the zapping log and the list of profiles associated with each set top box are provided. The sample may be provided by an operator of the set top box collection. Predefined profiles can be, for example, but not limited to, demographic profiles that define gender, age, marital status, income level, or psychographic (behavioral) profiles.

The Association Rule can be applied to any set top box in the same network, as is performed during identification. An example of a process that may be used to derive the Association Rule follows. The management application 50 contains knowledge of the current consumed service for a specific decoder, the profiles (demographic, or behavioral) associated with a specific decoder and household, and previously consumed content for a specific decoder. In accordance with the present invention, the management application 50 uses inference functionality to determine the current viewer/listener profile. The inference functionality defines the current profile(s) that is/are consuming the service.

An example of inference functionality follows, where the learning functionality uses Bayes rule. At this point, the management application 50 contains knowledge of the current consumed service for a specific decoder (set top box). In addition, the management application 50 knows the demographic profiles associated with a specific decoder and household. Further, the management application 50 knows previously consumed content for a specific decoder, specifically, the short-term history. The management application 50 may then use the inference functionality to determine the current viewer/listener profile.

An example for the inference functionality using Bayes rule is provided hereinafter. In the learning algorithm, data collection determines the distribution of the consumed content as a function of the classification of the viewers/listeners at the household. In addition, using the data in conjunction with the Bayes rule, the probability that the household contains a viewer/listener belonging to each demographic profile is estimated. Data utilized to perform this process includes probabilities of each consumed service for households containing each of the demographic profiles, as well as probabilities of each consumed service for households not containing each of the demographic profiles.

Bayes rule reads as shown by equation 1 below.

P(C|F1 . . . Fn)=P(F1 . . . Fn|C)*P(C)/(P(F1 . . . Fn|C)*P(C)+P(F1 . . . Fn|˜C)*P(˜C))   (Eq. 1)

In equation 1, P (F1 . . . Fn|C) is the probability that a household containing a certain profile (C) consumes the list of services F1 . . . Fn and does not consume any other service. In addition, P (F1 . . . Fn|˜C) is the probability that a household not containing a certain profile (C) consumes the list of services F1 . . . Fn and does not consume any other service. Further, P(C) is the probability that a household contains profile C, regardless of the services consumed and P(˜C) is the probability that a household does not contain profile C, regardless of the services consumed.

P(F1 . . . Fn|C) and P(F1 . . . Fn|˜C) may be approximated as the products P(F1|C)* . . . *P(Fn|C) and P(F1|˜C)* . . . *P(Fn|˜C) respectively, which may be calculated directly from the statistics gathered for the sample population. Better approximations may be obtained by considering correlations between services and between profiles in a household. From the above calculation, the result is the probability, P(C|F1 . . . Fn), that a household contains profile C, given the list of the household consumed services. The collection of all values P(C|F1 . . . Fn), calculated over all of the sample set top boxes, represents the Association Rule used for the identification step, which is applied to each set top box in the network that was not part of the sample set top boxes. In addition, from this calculation, the result is the probability that a certain individual viewer from a specific profile used the set top box.
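For illustration, equation 1 with the product approximation may be sketched as follows. The function name, the service names, and the probability values are hypothetical. In practice the per-service probabilities would be gathered from the sample population.

```python
def profile_posterior(consumed, p_service_given_c, p_service_given_not_c, prior_c):
    """P(C | F1..Fn) per Eq. 1, using the product approximation.

    consumed:              list of consumed services F1..Fn.
    p_service_given_c:     {service: P(F|C)} from the sample population.
    p_service_given_not_c: {service: P(F|~C)} from the sample population.
    prior_c:               P(C); P(~C) is taken as 1 - P(C).
    """
    like_c = 1.0
    like_not_c = 1.0
    for service in consumed:
        like_c *= p_service_given_c[service]
        like_not_c *= p_service_given_not_c[service]
    numerator = like_c * prior_c
    denominator = numerator + like_not_c * (1.0 - prior_c)
    return numerator / denominator

# Hypothetical sample statistics for a "child" profile C.
p_given_c = {"cartoons": 0.8, "news": 0.2}
p_given_not_c = {"cartoons": 0.1, "news": 0.5}
posterior_one = profile_posterior(["cartoons"], p_given_c, p_given_not_c, prior_c=0.3)
posterior_two = profile_posterior(["cartoons", "news"], p_given_c, p_given_not_c, prior_c=0.3)
```

With these invented numbers, consuming only the cartoons service raises the probability that the household contains profile C from the prior of 0.3 to roughly 0.77, and additionally consuming the news service lowers it to roughly 0.58.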

In accordance with an alternative embodiment of the invention, a sample may be provided, and post processing may be provided to associate content with profiles. Specifically, a sample may include at least one profile, a set top box associated with the profile, and zapping information associated with the set top box. Post processing may then be performed on the sample to determine which content (e.g., advertisement) is most appropriate for providing to the consumer associated with the profile. As a result, in accordance with this alternative embodiment of the invention, the learning process is not required.

Identification

Identification is a process of recognition of a list of profiles as being associated with a certain set top box (STB), based on the learning results. Every set top box in the network should be assigned at least one profile (demographic, or behavioral). It is reasonable to assume that, in most cases, there is more than one active profile in front of a set top box, and there are cases where the same profile should be associated several times with the same set top box. Thus, for each set top box there should be assigned one or more profiles. For example, a young couple (male and female) between the ages of 20-30 that are living together would produce 2 profiles, specifically, one for the female and the other for the male. As another example, if a specific household has two boys of the ages seven and fourteen, the boys may both be assigned to an appropriate set top box as the same profile, “Male 6-18.”

To determine the list of profiles associated with a set top box, the Association Rule is mathematically applied to the list of set top box signatures (block 708).

Analysis

Analysis is the process of breaking down and studying the results of learning and identification in order to estimate possible identification errors, provide a set of different factors and amendments for post processing, associate the definition of profiles by signatures with a third party definition, and perform any other functionality resulting from studying the learning and identification results.

The identification error analysis may be performed via mathematical modeling means and/or via simulation (empirical) means. For example, estimation of expected identification errors may be achieved via applying the learned results to a part of the sample and simulating the identification results.
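The simulation approach may be sketched as follows, assuming that part of the sample is held back and that the identified profiles are compared with the known profiles for those set top boxes. The function name, the profile labels, and the data are hypothetical.

```python
def identification_error_rates(identified, actual, all_profiles):
    """Return (false positive rate, false negative rate) over STB-profile pairs.

    identified:   {stb_id: set of profiles produced by identification}.
    actual:       {stb_id: set of profiles known from the held-back sample}.
    all_profiles: every pre-defined profile that could be assigned.
    """
    fp = fn = positives = negatives = 0
    for stb_id, true_profiles in actual.items():
        predicted = identified.get(stb_id, set())
        for profile in all_profiles:
            if profile in true_profiles:
                positives += 1
                fn += profile not in predicted  # profile present but missed
            else:
                negatives += 1
                fp += profile in predicted      # profile absent but reported
    fp_rate = fp / negatives if negatives else 0.0
    fn_rate = fn / positives if positives else 0.0
    return fp_rate, fn_rate

# Hypothetical held-back sample modelled on televisions A and B of the example household.
all_profiles = {"adult_m", "adult_f", "child"}
actual = {"A": {"adult_m", "adult_f", "child"}, "B": {"child"}}
identified = {"A": {"adult_f", "child"}, "B": {"child", "adult_m"}}
fp_rate, fn_rate = identification_error_rates(identified, actual, all_profiles)
```

Here one of the four true associations is missed and one of the two absent associations is reported, giving a false negative rate of 0.25 and a false positive rate of 0.5.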

Post Processing

Post Processing is the process of calculating the data required for presentation to potential customers, such as targeted rating. Post processing also includes reporting and analyzing based on results of identification. The aforementioned list of results is obtained via post processing functionality described hereafter. Such functionality may be provided by, for example, algorithms. Post processing may be utilized to calculate the following data, although post processing calculation is not intended to be limited to calculating only this data; rather, by post processing any calculation done with the use of the results obtained from the learner and/or identifier is referred to as a post processed calculation/algorithm.

Targeted Rating

The targeted rating of a content per profile may be calculated (e.g., using optimization algorithms, see example herein below) from the learned and identified data, or from any independent data (e.g., obtained only from the sample), as long as it contains information about the set top box signatures (e.g., viewing signatures) and the profile(s) associated with each set top box in the input. A content may be, for example, but not limited to, a program. If the network covers more than one region, and information on the regions in which the different set top boxes in the network reside is available, a regional targeted rating (RTR) may be calculated using methods similar to the method described below. In addition, a (regional) targeted rating at high resolution time steps (e.g., every 30 seconds) may be calculated for each specific channel and profile.

The output of the targeted rating functionality is the percentage of each pre-defined profile that watched each of the contents, for example, programs, in the aggregation of the set top box signatures, for example, viewing signatures (see an example table below). An example of a method to calculate targeted rating, given a list of set top boxes with viewing signatures and the profile(s) associated with each one of them, uses a linear regression optimization algorithm. Let A be the set of parameters representing the association of profile(s) to set top boxes. Let B be the aggregation of targeted rating probabilities of each of the profiles for each program watched by at least a portion of the set top boxes of the network for which the zapping log contains records of set top box zapping signatures (the yet unknown and desired output). Let C be the parameters representing the aggregation of the set top box viewing signatures (part of the input). Assuming that multiplying A by B corresponds to C, a minimization algorithm on the squared norm of

(AB−C)   (Eq. 2)

is then performed (a random initial guess is provided to the algorithm for the values of B). In other words, given A and C, the output of applying this algorithm is the set of probabilities, B, representing the probability of each profile to watch each of the programs broadcast to the collection of set top boxes. An example table for such an output is presented below.

If the pre-defined profiles are:

1. Female of age 30-55 with high income.

2. Male of age 18-40 with average income.

3. Male child of age 6-16 with low income.

4. Female child of age 6-16 with average income.

And the list of programs (as specified in the viewing signatures) is:

1. Saturday Night Live.

2. Lost.

3. 24.

Then the targeted rating (TR) output would be the following table:

Program ID    Profile ID    Rating (in % of each profile)
1             1             0.5%
1             2             1%
1             3             0.01%
1             4             0.04%
2             1             3%
2             2             1.54%
2             3             0.01%
2             4             0
3             1             2.31%
3             2             2.11%
3             3             0
3             4             0
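The minimization of the squared norm of equation 2 may be sketched, under stated assumptions, as a plain gradient descent starting from a random initial guess for B. The description names only a minimization algorithm, so the choice of gradient descent, the step size, the number of steps, and the small matrices below are assumptions made for illustration. The sketch does not constrain B to the range of valid probabilities.

```python
import random

def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]

def transpose(x):
    return [list(col) for col in zip(*x)]

def fit_targeted_rating(a, c, steps=500, rate=0.05, seed=0):
    """Minimise the squared norm of (AB - C) over B by gradient descent.

    a: set top box x profile association parameters (A).
    c: set top box x program viewing signatures (C).
    Returns B, the profile x program targeted rating estimates.
    """
    rng = random.Random(seed)
    n_profiles, n_programs = len(a[0]), len(c[0])
    # Random initial guess for B, as mentioned in the description.
    b = [[rng.random() for _ in range(n_programs)] for _ in range(n_profiles)]
    a_t = transpose(a)
    for _ in range(steps):
        residual = [[p - q for p, q in zip(r1, r2)] for r1, r2 in zip(matmul(a, b), c)]
        grad = matmul(a_t, residual)  # the full gradient is 2 * A^T (AB - C)
        b = [[v - rate * 2 * g for v, g in zip(b_row, g_row)]
             for b_row, g_row in zip(b, grad)]
    return b

# Hypothetical input: three set top boxes, two profiles, two programs.
a = [[1, 0], [0, 1], [1, 1]]
c = [[0.5, 0.0], [0.1, 0.4], [0.6, 0.4]]
b = fit_targeted_rating(a, c)
```

In this constructed example C was generated from a known B, so the estimate converges to approximately [[0.5, 0.0], [0.1, 0.4]]. With real signatures an exact fit is not expected and the result is the least squares compromise.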

Content to Profile Assignment

In addition to a targeted rating of a content (for example, program) per profile, a content to profile assignment (C2P) may be determined. Content may be, for example, but not limited to, a program. The present description provides an example for illustration purposes only. Similarly an assignment of any content in a specific time slot to a specific profile in the household that consumed this content may be made. Obtaining a content to profile assignment involves determining for each program that was watched by a certain set top box, which is the specific profile, of the profiles associated to this set top box, that watched the program. This can be done, for example, via use of algorithms applying algebraic manipulations to the sets of parameters representing the aggregation of viewing (or other) signatures of the set top boxes (such as C above), the parameters representing the association of profile(s) to set top boxes (e.g., A above) and parameters representing targeted rating probabilities (e.g., B above).

Total Viewership

Further, a total viewership may be calculated (using, e.g., a program to time slot map and applying to it a calculation algorithm which utilizes data obtained in the previous steps described here), which is the calculation of total aggregated viewing activities for each of the pre-defined profiles (these may be demographic or behavioral), during a twenty-four hour period for each day of the week.

For example, having the association of profile(s) with each set top box, represented as a set of probabilities (either obtained as an output from the learning and identification steps or given from an outside source), and given the set top box signatures (e.g., as an output from the data modeling stage), given in addition the broadcasting time table (showing for a pre-defined period of time at which time and date and for which duration each program was broadcasted), the following calculation is performed.

The data is aggregated and arranged such that, for each day of the week (24 hours), it is calculated how many of each of the pre-defined profiles watched any content during each of the pre-defined time intervals. For example, if the period decided upon is three months and there were 12 Sundays during this period, the 24 hour period is divided into intervals of 15 minutes, and for each such interval it is calculated (using the set top box signatures and the data mentioned above) how many times each of the pre-defined profiles watched any content during that 15 minute interval, aggregated over all 12 Sundays. This information is then presented in a graph showing the viewing peaks during a 24 hour Sunday, divided into 15 minute slots, for each profile. This is done for each day of the week (aggregated over the number of times that day of the week occurred during the three month period).
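The aggregation described above may be sketched as follows, assuming that viewing sessions are already available per set top box with a weekday label and start and end minutes within the day, and that no session crosses midnight. The function name, the tuple layout, and the sample data are hypothetical.

```python
from collections import defaultdict

SLOT_MINUTES = 15

def total_viewership(sessions, stb_profiles):
    """Return {(weekday, slot, profile): aggregated expected viewers}.

    sessions:     (stb_id, weekday, start_minute, end_minute) viewing sessions,
                  with end_minute > start_minute and both within one day.
    stb_profiles: {stb_id: {profile: association probability}}.
    """
    totals = defaultdict(float)
    for stb_id, weekday, start, end in sessions:
        first_slot = start // SLOT_MINUTES
        last_slot = (end - 1) // SLOT_MINUTES
        for slot in range(first_slot, last_slot + 1):
            # Each associated profile contributes its association probability.
            for profile, prob in stb_profiles[stb_id].items():
                totals[(weekday, slot, profile)] += prob
    return dict(totals)

# Hypothetical data: sessions from two different Sundays are aggregated together.
stb_profiles = {"A": {"adult": 1.0, "child": 0.5}, "B": {"child": 1.0}}
sessions = [
    ("A", "Sun", 720, 750),  # 12:00 to 12:30
    ("B", "Sun", 735, 745),  # 12:15 to 12:25
    ("A", "Sun", 725, 740),  # 12:05 to 12:20 on another Sunday
]
totals = total_viewership(sessions, stb_profiles)
```

Slot 48 covers 12:00 to 12:15 and slot 49 covers 12:15 to 12:30. Plotting the totals per slot for one weekday and one profile yields the viewing peak graph described above.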

In addition to the abovementioned, a targeted rating distribution may be determined, which involves, for every channel, for every profile, calculating the rating of the channel for every brief period of time (e.g., thirty seconds), for every minimally defined region. Further, a viewership flow may be determined, which includes, for every channel, calculating the number (or percentage) of viewers of every profile that join and leave the channel during every short period of time (e.g., thirty seconds), for every minimally defined region. Still further, creative reports may be determined such as, for example, during an advertisement break, for each second, calculating the rating and viewership flow. All the aforementioned are merely examples of the post processing possibilities.

In the supervised case, with the knowledge gained by the functionality of block 310, for any households that did not fill out the questionnaire, the management application 50 uses identification functionality to associate the rest of the set top boxes 110 with the profiles that are using the set top boxes 110 (block 312). An example of the functionality, which is used as a basis for such an identification functionality, is provided herein below. It should be noted that different relevant learning methods may be used to perform the identification functionality. Examples of such learning methods may include the use of any one of the following: Bayesian learning, various statistical methods, artificial neural networks, decision trees, k-nearest neighbor, quadratic classifier, support vector machine, various optimization methods, and direct calculation of probabilities. Of course, other learning methods may be used and are intended to be included within the present description.

Viewership Flow

Using the identified profiles data and high-resolution time signatures, a viewership flow may be calculated. It should be noted that a high-resolution time signature is a representation of which channel each set top box watched during each time step of a specific time interval, such as, but not limited to, thirty seconds. In addition, a viewership flow is the number of viewers of each profile that left or joined watching a specific channel during each time interval (e.g., 30 seconds), during a day or any pre-defined time interval.

Viewership flow may be calculated using, for example, but not limited to, a high resolution regional targeted rating, in addition to the data of signatures and lists of profiles associated with each set top box.

Calculation of viewership flow is performed in a few steps. It should be noted that the following is an example of steps that may be used to calculate viewership flow, however, the following example is not the only way to calculate viewership flow and this example is not intended to be limiting. As a first step, the high resolution regional targeted rating is calculated. Calculation of the high resolution regional targeted rating provides, per each channel and per each viewer profile, the percentage of viewers of this viewer profile that watched this channel per each time interval (for example, 30 seconds) during each day of a specified period. Such targeted rating may be calculated, for example, but not limited to, using a method similar to the method described in the targeted rating section of the present description, where the word program is replaced by channel per time interval.

To calculate viewership flow, the differences between the targeted ratings of same viewer profiles, per different time intervals, may be calculated to record the change in number of viewers of each profile between successive time intervals. Moreover, using for example, but not limited to, the method described above as content to profile assignment, the number of viewers that left or joined the viewers of each channel at each time interval may be calculated. To summarize: the viewership flow application may contain various descriptions of changes in viewers per channel per time interval. Examples of the abovementioned include, but are not limited to, targeted rating and the changes in targeted rating per time interval, and number of viewers of each profile who left or joined the viewers of the channel at each time interval.
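A simplified sketch of counting viewers who joined or left a channel follows. It assumes a high-resolution time signature listing the channel watched by each set top box at each time step, and it assumes that every profile identified for a set top box follows each channel change. The latter is a simplification that a content to profile assignment would refine. All names and data are hypothetical.

```python
from collections import defaultdict

def viewership_flow(time_signatures, stb_profiles, channel):
    """Return, for each step transition, {profile: (joined, left)} for one channel.

    time_signatures: {stb_id: [channel watched at each time step]}.
    stb_profiles:    {stb_id: [profiles identified for that set top box]}.
    """
    n_steps = len(next(iter(time_signatures.values())))
    flow = []
    for step in range(1, n_steps):
        counts = defaultdict(lambda: [0, 0])
        for stb_id, channels in time_signatures.items():
            was_on = channels[step - 1] == channel
            is_on = channels[step] == channel
            if was_on == is_on:
                continue  # no change for this set top box at this step
            for profile in stb_profiles[stb_id]:
                counts[profile][0 if is_on else 1] += 1
        flow.append({p: tuple(c) for p, c in counts.items()})
    return flow

# Hypothetical 30-second steps for two set top boxes.
time_signatures = {"A": ["CNN", "CNN", "BBC"], "B": ["BBC", "CNN", "CNN"]}
stb_profiles = {"A": ["adult"], "B": ["child"]}
flow = viewership_flow(time_signatures, stb_profiles, "CNN")
```

In this sample, one viewer of the child profile joins the channel at the first transition and one viewer of the adult profile leaves it at the second.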

Unsupervised Learning

Reference is now made to the flowchart 800 of FIG. 8. The flowchart 800 of FIG. 8 further illustrates the process of identifying and associating consumer profiles to set top boxes 100A-100D within an unsupervised learning scenario. It should be noted, that unlike with supervised learning, with unsupervised learning no sample relating viewer profiles to set top boxes is provided. Moreover, the type of viewer profiles might be unknown at the stage of the learning. As a result, the viewer profiles must be determined. It should be noted that different types of viewer profiles may exist, including, but not limited to, demographic and psychographic types of viewer profiles. For example, for the psychographic type of viewer profile, the profile may contain multiple categories, such as, but not limited to, watching habits, purchasing behavior, social class, lifestyle, opinions, and values.

To determine viewer profiles one of many methods may be used, such as, but not limited to, using clustering algorithms to find common denominators within a population in association with viewing habits of the population. An example of a method that may be used for profile learning and determination is provided below.

As shown by block 802, set top boxes 110 in the network 10 record all zapping events created by the consumers. The set top boxes 110 send the zapping events to the management application 50 (block 804). It should be noted that the zapping events include an identification of the set top box from which the zapping events were derived. The management application 50 then associates behavior of consumers and their zapping patterns (block 806).

FIG. 9 is a block diagram further illustrating functionality of the management application 50 as blocks of logic. As shown by FIG. 9, the management application 50 contains modeling logic 902, learning logic 904, identification logic 906, analyzer logic 908, profiles determination logic 910, post processor logic 912, and reporting logic 914. The logic of the management application 50 is further described in detail with regard to the logical flow diagram of FIG. 10.

FIG. 10 is a detailed logical flow diagram illustrating a sequence of events performed during unsupervised learning. The zapping log and the broadcast schedule (arrows 1) are the inputs to modeling functionality of the management application 50, the output of which is a collection of set top box signatures (arrow 2), wherein the collection of set top box signatures includes a signature for each set top box in the network. The set top box signatures may be one of multiple classes of signatures, wherein the classes of signatures include viewing signatures, time signatures, and zapping frequency signatures. Each set top box in the network may have multiple signatures, wherein the signatures for a single set top box are selected from the classes of signatures. In fact, for example, a single set top box may even have one or more of each class of signature. Each such set top box also has a unique identification (ID). Viewing signatures are vectors of all the programs watched during a specified period by each of the set top boxes in the network.

The set top box signatures are the input used by learning functionality (arrow 3) of the management application 50. The learning functionality clusters profiles into groups of profiles that are yet unresolved. It should be noted that an unresolved profile is a profile for which a type is not yet known. Specifically, the learning functionality, which is further described in detail below under the section entitled “learning”, is capable of using the set top box signatures and determining relationships between profiles to derive clusters of profiles, where a type of a profile is not yet known. As an example, an optimization algorithm may be used to cluster the profiles into groups of unresolved profiles, an example of which is illustrated below. The learning step may be performed a few times, to determine the number of existing profile groups available for identification from viewing signature data. This may be done by, for example, but not limited to, throwing out, after each iteration, the profile groups whose similarity to each other is greater than a pre-defined threshold.
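The repeated learning step may be sketched as follows. This sketch substitutes a basic k-means clustering for the unspecified optimization algorithm and uses cosine similarity for the similarity test. The initialization from the first k signatures, the threshold of 0.95, and the sample signatures are all assumptions made for illustration.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def kmeans(points, centroids, iterations=20):
    """Basic k-means: assign each point to its nearest centroid, then re-average."""
    for _ in range(iterations):
        groups = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            groups[nearest].append(p)
        centroids = [
            [sum(col) / len(g) for col in zip(*g)] if g else c
            for g, c in zip(groups, centroids)
        ]
    return centroids

def prune_similar(centroids, threshold):
    """Throw out profile groups that are too similar to one already kept."""
    kept = []
    for c in centroids:
        if all(cosine(c, k) <= threshold for k in kept):
            kept.append(c)
    return kept

def learn_profile_groups(signatures, k, threshold=0.95):
    """Repeat the learning step until no profile group is thrown out."""
    while True:
        centroids = kmeans(signatures, [list(s) for s in signatures[:k]])
        kept = prune_similar(centroids, threshold)
        if len(kept) == len(centroids):
            return kept
        k = len(kept)

# Hypothetical viewing signatures over three programs for five set top boxes.
signatures = [
    [0.9, 0.1, 0.0], [0.8, 0.2, 0.0], [0.0, 0.1, 0.9],
    [0.1, 0.0, 0.8], [0.88, 0.12, 0.0],
]
groups = learn_profile_groups(signatures, k=3)
```

Starting from three candidate groups, two of them are nearly identical, so one is thrown out and the learning step is repeated with two groups, which are then retained as the unresolved profile groups.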

As previously mentioned, the output of the learning functionality of the management application 50 is clusters of yet unresolved profiles (arrow 4). The clusters of the yet unresolved profiles, together with a profile description (arrows 5), are the input to the profiles determination functionality of the management application 50.

The profiles description is a classification, or definition, of profiles of viewers by groups that associates between, for example, viewing habits and purchasing habits of individuals. The profiles description is provided by an external source, such as, but not limited to, a single source researcher. It should be noted that the profile description input is some external definition of profiles that is fed to the system.

The profiles determination functionality performs a match between the profiles found by the learning functionality (unresolved profiles) and the profiles description from the external source, which determines whether to match the profiles to demographic clustering or to a specific psychographic clustering, for example, by consuming habits. The profile determination with respect to a given profile description may be done, for example, by performing a standard best match procedure on each of the profiles in both groups (unresolved and pre-defined) and by finding, for each profile from the unresolved group, the best possible match among the defined profiles. It should be noted that sometimes one unresolved profile might fit two described profiles and vice versa: two or more unresolved profiles can match one profile from the described profiles group.
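One possible best match procedure is sketched below. It assumes that the unresolved profiles and the described profiles are expressed as vectors over the same features, and it uses cosine similarity as the matching score. The description does not name a specific procedure, and the profile labels and vectors are hypothetical.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def best_matches(unresolved, described):
    """Map each unresolved profile to the described profile it resembles most.

    Several unresolved profiles may map to the same described profile.
    """
    return {
        name: max(described, key=lambda d: cosine(vector, described[d]))
        for name, vector in unresolved.items()
    }

# Hypothetical feature vectors (shares of viewing over three program genres).
unresolved = {"cluster_0": [0.86, 0.14, 0.0], "cluster_1": [0.05, 0.05, 0.85]}
described = {
    "sports viewers": [1.0, 0.0, 0.0],
    "news viewers": [0.0, 1.0, 0.0],
    "children": [0.0, 0.0, 1.0],
}
matches = best_matches(unresolved, described)
```

Because each unresolved profile is matched on its own, this sketch naturally allows two unresolved profiles to resolve to the same described profile.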

The output of the profiles determination functionality is the resolved profiles (arrow 6), which are the input, together with the set top box signatures, to an identification functionality (arrows 7).

In accordance with an alternative embodiment of the invention, the learning and the profiles determination functionalities may be performed simultaneously by combining these two functionalities (learning and profile determination) of the management application 50 into one. In accordance with this embodiment, the profiles description and the set top box signatures are both fed as inputs to the learning and profiles determination functionalities (arrows 3 and 5). In this case, the learning and profiles determination functionalities are performed together. The output of the learning and profiles determination functionalities is resolved profiles (arrow 6). In the case of combining these two functionalities, directing the learning process toward the input profiles description may be done by, for example, but not limited to, feeding the described profiles as an initial guess to the optimization process and using the number of the defined profiles as the number of profiles to be found.

The resolved profiles are used together with the set top box signatures as an input to the identification functionality of the management application 50 (arrows 7), which associates each set top box in the network with at least one profile. During identification, for example, a quantization process may be performed.

A quantization process is a process during which, rather than having a continuous range of probabilities of having each of the profiles associated with some set top box, some profiles would be decided as not associated to that set top box (due to having a too small probability of being associated), while other profiles would be decided as being associated (with some higher probability, or 1). A quantization process may be performed by, for example, calculating a statistical constant related to the association of profiles to set top boxes (see detailed explanation below) and performing rounding steps. A quantization procedure may be performed at various steps of the learning and identification process.
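A minimal sketch of such a rounding step follows. The threshold stands in for the statistical constant mentioned above. Its value of 0.5, the function name, and the probabilities are assumptions made for illustration.

```python
def quantize(probabilities, threshold=0.5):
    """Round each association probability to 0 (not associated) or 1 (associated).

    probabilities: {profile: probability of being associated with one set top box}.
    threshold:     hypothetical stand-in for the statistical constant.
    """
    return {profile: 1 if p >= threshold else 0 for profile, p in probabilities.items()}

# Hypothetical association probabilities for one set top box.
probabilities = {"profile_1": 0.9, "profile_2": 0.05, "profile_3": 0.6}
associated = quantize(probabilities)
```

In this sample, profile 2 is decided as not associated with the set top box, while profiles 1 and 3 are decided as associated.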

The identification of lists of profiles associated with each set top box in the network may be performed by, for example, but not limited to, combining the association rule between unresolved profiles to set top boxes and the association rule between resolved and unresolved profiles to create an association rule associating lists of resolved profiles to set top boxes. For example, the association rules may be matrices of parameters and the application of the association rules may be performed, by using matrix multiplication.
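The combination of the two association rules by matrix multiplication may be sketched as follows. The matrix shapes and values are hypothetical: two set top boxes, two unresolved profiles, and three resolved profiles.

```python
def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]

# Rows: set top boxes. Columns: unresolved profiles.
stb_to_unresolved = [[0.9, 0.1],
                     [0.2, 0.8]]

# Rows: unresolved profiles. Columns: resolved profiles.
unresolved_to_resolved = [[1.0, 0.0, 0.0],
                          [0.0, 0.5, 0.5]]

# Rows: set top boxes. Columns: resolved profiles.
stb_to_resolved = matmul(stb_to_unresolved, unresolved_to_resolved)
```

Each row of the product gives, for one set top box, a parameter for every resolved profile. Such rows could then be passed to a quantization step to obtain the final list of profiles.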

The output of the identification functionality (arrow 8) is the identification of which profile(s) uses each of the set top boxes in the network. In other words, the output is an identification of at least one profile associated with each set top box in the network.

The profiles description, set top box signatures, and profiles associated with each set top box (arrows 9) are fed to analyzer functionality of the management application 50, the output of which is an estimation of identification quality and error estimation (arrow 11). Specifically, the analyzer is a self-assessment tool of the management application. The analysis in the case of unsupervised learning is performed with respect to the profiles definition input. The output of the analyzer may be, for example, the quality of the ability of the system to classify the profiles into groups according to the given profile definition, ranking the quality of the input data in view of desired output versus the actual output, and error estimation regarding the accuracy of the identification process.

The estimated errors may be, for example, the expected deviation from the actual situation, and the false positive and false negative identification rates. Moreover, correlations between the different profile groups may be calculated, thereby providing information regarding the identification possibilities of certain profiles with respect to their correlations with other profiles. This may be done, for example, by comparing results with known statistics, or by comparing results obtained for the whole network with results obtained from a representative subgroup of the network.
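As a hedged example of one such error estimate, the following sketch computes false positive and false negative rates by comparing identified associations against reference associations for a subgroup. The function name `error_rates` and the small binary matrices are hypothetical; how the reference associations would be obtained is outside the scope of this sketch.

```python
import numpy as np

def error_rates(predicted, reference):
    """Return (false positive rate, false negative rate) for binary associations."""
    predicted = np.asarray(predicted, dtype=bool)
    reference = np.asarray(reference, dtype=bool)
    false_pos = np.sum(predicted & ~reference)   # identified but not in reference
    false_neg = np.sum(~predicted & reference)   # in reference but not identified
    # Assumes the reference contains at least one positive and one negative.
    return false_pos / np.sum(~reference), false_neg / np.sum(reference)

# Hypothetical data: rows are set top boxes, columns are profiles.
predicted = [[1, 0], [1, 1], [0, 0], [0, 1]]
reference = [[1, 0], [0, 1], [0, 0], [1, 1]]
fp_rate, fn_rate = error_rates(predicted, reference)
```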

The identified profiles associated with a set top box are fed as an input, together with the set top box signatures (either the same ones used for the learning and identification functionalities, or others, such as time signatures or high resolution time signatures) and additional set top box data, if required, to post processor functionality of the management application 50 (arrows 12). The post processing functionality computes various data, such as: regional targeted rating (RTR), content to profile assignment (C2P), total viewership and viewership flow. A description of these functionalities was presented above. Note that the computation of the functionalities of the post processor may remain the same for data (associating lists of profiles to set top boxes) obtained via supervised learning, unsupervised learning, or an external source.

Reporting functionality of the management application 50 uses the computed data to produce business and other reports (arrow 13). As with the supervised scenario, the association process, also referred to as the learning and identification process, is divided into multiple steps. The steps in the association process include data collection, modeling, learning, profiles determination, identification, analysis, and post processing. Of these steps, the data collection, modeling, analysis, and post processing usually remain the same for both the supervised and unsupervised processes. The main difference between the supervised and unsupervised processes is in the learning step, which may also include a profile determination step, and which may introduce some differences in the identification steps. Note that the steps of learning, profile determination, and identification are sometimes referred to herein, for short, as “unsupervised learning”. The unsupervised learning process is further defined herein below.

Learning

For unsupervised learning, each set top box signature is learned to be associated with a certain list of unresolved profiles defined solely using the set top box signatures. Examples of such set top box signatures include, but are not limited to, viewing signatures, time signatures, high resolution time signatures, and zapping frequency signatures. It should be noted that the main difference from the supervised learning process is that no sample is provided in this case. An unsupervised learning algorithm receives only the set top box signatures as an input, resulting in a classification of profiles into, for example, a certain type of psychographic (for example, behavioral) or demographic profile groups. After the first step (unless the steps of learning and profile resolving are combined), the resulting learned profiles are usually still unresolved, meaning that their nature has yet to be determined.

Examples of unsupervised learning algorithms include, but are not limited to, least squares algorithms and algorithms that provide minimization via steepest descent. Other outputs from the learning algorithms include an association of profiles to set top boxes and obtaining a targeted rating of the defined profiles at the same time, thereby providing a probability that a profile is associated with a set top box.

The following is provided as an example of an unsupervised learning algorithm. An input to the unsupervised learning process is the collection of set top box signatures, which is the output of the data modeling process. Assume as an example that these are viewing signatures (although these might be time signatures, etc.), and denote their parametric representation by a matrix C. For example, each row of the matrix C may refer to one set top box, and each column of the matrix C may refer to, for example, but not limited to, one program, where the entries of matrix C may be, for example, the portions of the programs that each set top box watched, or, for example, the probabilities with which each of the set top boxes represented in matrix C watched each of the programs represented in matrix C. Let us denote by a matrix A the collection of probabilities representing the association of viewer profiles with the set top boxes, where the entries of the matrix A are the probabilities of each of the viewer profiles being associated with each of the set top boxes. Note that the viewer profiles might still be unresolved at this stage. Let us denote by the matrix B the targeted rating probabilities. Both A and B are unknown in the case of unsupervised learning. To obtain the desired outputs A and B, we use, for example, but not limited to, the following method. We minimize the squared norm of the difference (AB−C) (see equation 3) to obtain an approximation of the matrix C as the product AB. For this, we use, for example, but not limited to, a convex optimization algorithm (or, for example, some other nonlinear minimization algorithm) under various constraints, such as, but not limited to, that each quantity in A is greater than zero and smaller than one, and each quantity in B is greater than zero and smaller than, for example, 0.5. The following description further describes this process.

Following this example, to determine a possible algorithm for minimizing the squared norm of the matrix (AB−C) considered above (see equation 3), it is assumed that the population consists of viewers that can be divided into several groups of different profiles, where each viewer may belong to one or more groups of viewer profiles. Each such group of profiles is associated, for example, with a behavior pattern in terms of watching habits, where the pattern consists of, for example, but not limited to, the viewing signatures and the targeted rating per content and per each profile, where the targeted rating for the profile is the probability of a viewer of this profile watching each program, or some other definition of content.

Since the number of all possible profile groups is usually low compared to the number of programs and set top boxes in the network, one is actually looking for a low rank approximation of the matrix C. The term low rank (of matrices A and B) refers in this case to the fact that the number of different profile groups is smaller than the dimensions of C, which represent, for example, the number of set top boxes in the network and the number of programs. Because of this low rank, the matrices A and B may be obtained using this approximation. One approach to obtaining a low rank approximation of the matrix C is to search for the matrices A and B that minimize the squared norm of the matrix (AB−C). This can be done using, for example, a convex optimization method on the quantity of equation 3, which reads:

$\begin{matrix} \begin{matrix} {n = \left\| {AB - C} \right\|^{2}} \\ {= {\sum\limits_{i,j}\left( {{\sum\limits_{k}{A_{ik}B_{kj}}} - C_{ij}} \right)^{2}}} \\ {= {{Trace}\left( {\left( {AB - C} \right)^{T}\left( {AB - C} \right)} \right)}} \end{matrix} & \left( {{Eq}.\mspace{14mu} 3} \right) \end{matrix}$

where n denotes the squared norm of (AB−C), and trace is a known operation on a matrix providing the sum of the diagonal. In order to minimize this efficiently, one may use the derivatives of equation 3, described in equations 4 and 5, each of which read as follows:

$\begin{matrix} {\frac{\partial n}{\partial A_{ab}} = {\left. {2{\sum\limits_{j}{\left( {{\sum\limits_{i}{A_{ai}B_{ij}}} - C_{aj}} \right)B_{bj}}}}\Rightarrow\frac{\partial n}{\partial A} \right. = {2\left( {AB - C} \right)B^{T}}}} & \left( {{Eq}.\mspace{14mu} 4} \right) \end{matrix}$

and correspondingly,

$\begin{matrix} {\frac{\partial n}{\partial B} = {2{A^{T}\left( {{AB} - C} \right)}}} & \left( {{Eq}.\mspace{14mu} 5} \right) \end{matrix}$

The second derivatives may also be calculated in order to perform this minimization and they are given by the combination of equations 6, 7, and 8 below:

$\begin{matrix} {\frac{\partial^{2}n}{{\partial A_{ab}}{\partial A_{c\; d}}} = {2{\delta_{a\; c}\left( {BB}^{T} \right)}_{bd}}} & \left( {{Eq}.\mspace{14mu} 6} \right) \\ {\frac{\partial^{2}n}{{\partial B_{ab}}{\partial B_{c\; d}}} = {2{\delta_{bd}\left( {A^{T}A} \right)}_{a\; c}}} & \left( {{Eq}.\mspace{14mu} 7} \right) \\ {\frac{\partial^{2}n}{{\partial A_{ab}}{\partial B_{c\; d}}} = {{2A_{a\; c}B_{bd}} + {2{\delta_{bc}\left( {{AB} - C} \right)}_{ad}}}} & \left( {{Eq}.\mspace{14mu} 8} \right) \end{matrix}$

Using any standard convex optimization technique and the derivatives above with the (convex) constraints 0≦A_(ij),B_(ij)≦1, a solution of the optimization problem may be found, where the shared inner dimension of the matrices A and B (the number of columns of A, which equals the number of rows of B) is chosen as the desired, or expected, number of profiles.
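As a minimal sketch of one possible minimization routine, and not necessarily the routine contemplated above, the following example applies alternating projected gradient descent to the quantity of equation 3, using the gradients of equations 4 and 5 and clipping every entry of A and B to the range from 0 to 1. The function name `factorize`, the step-size rule, the number of iterations, and the synthetic data are all assumptions introduced for this illustration.

```python
import numpy as np

def factorize(C, n_profiles, steps=500, seed=0):
    """Approximate C by A @ B, keeping all entries of A and B in [0, 1]."""
    rng = np.random.default_rng(seed)
    n_boxes, n_programs = C.shape
    # Random initial guess for the unknown probabilities.
    A = rng.uniform(0.1, 0.9, size=(n_boxes, n_profiles))
    B = rng.uniform(0.1, 0.9, size=(n_profiles, n_programs))
    history = [np.sum((A @ B - C) ** 2)]            # n of Eq. 3
    for _ in range(steps):
        # Step in A along the gradient of Eq. 4, then project onto [0, 1].
        step_a = 1.0 / (2.0 * np.linalg.norm(B @ B.T, 2) + 1e-12)
        A = np.clip(A - step_a * 2.0 * (A @ B - C) @ B.T, 0.0, 1.0)
        # Step in B along the gradient of Eq. 5, then project onto [0, 1].
        step_b = 1.0 / (2.0 * np.linalg.norm(A.T @ A, 2) + 1e-12)
        B = np.clip(B - step_b * 2.0 * A.T @ (A @ B - C), 0.0, 1.0)
        history.append(np.sum((A @ B - C) ** 2))
    return A, B, history

# Hypothetical data: 4 set top boxes, 2 profiles, 3 programs.
A_true = np.array([[1, 0], [0, 1], [1, 1], [1, 0]], dtype=float)
B_true = np.array([[0.4, 0.1, 0.0], [0.0, 0.3, 0.5]])
C = A_true @ B_true

A, B, history = factorize(C, n_profiles=2)
```

The step sizes are the reciprocals of the largest eigenvalues of the second derivatives in equations 6 and 7, which is one simple way to keep each update from increasing the objective. The recovered A and B need not equal `A_true` and `B_true`, because the factorization is determined only up to a multiplicative constant per profile.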

The matrix A is to be understood as the set of probabilities of association of each of the profiles per each of the set top boxes and the matrix B is the targeted rating matrix. Since the matrix A is expected to contain binary quantities (either a profile exists in a household or not), and since the optimal solution is defined up to a multiplicative constant for each profile, it is desirable to find a good quantization criterion for A.

Instead of the above-described example, for the unsupervised learning algorithm, one may consider the slightly more complex example described below. Moreover, these alternative ways may be used to address specific different cases and the present invention is not limited to these examples. An example of an alternative way is, instead of minimizing the squared norm of the matrix (AB−C), minimizing the squared norm of (B−(A⁺)C), denoted herein by m:

m=∥B−(A ⁺)C∥ ²   (Eq. 9)

In addition, it is also possible to minimize the squared norm of (A−C(B⁺)), denoted by v:

v=∥A−C(B ⁺)∥²,   (Eq. 10)

where A⁺ denotes the pseudo-inverse of the matrix A, and B⁺ denotes the pseudo-inverse of the matrix B. For example, the Moore-Penrose pseudo-inverse may be used. This enables a reduction of the dimensionality of the problem, as the dimensions of the latter matrices are usually much smaller than those of the matrix (AB−C). Further, this approach creates a sharper distinction between the probabilities in A (desired to be binary) and those in B (usually small probabilities representing targeted rating) in the minimization process. The pseudo-inverse of a matrix is unique in mathematical terms, hence minimizing the quantity of equation 9 or of equation 10 is well defined. In the case of minimizing, for example, the quantity m, one would need to use the derivatives

${\frac{\partial m}{\partial A}\mspace{14mu} {and}\mspace{14mu} \frac{\partial m}{\partial B}},$

which involves calculating derivatives of the form

$\frac{\partial A^{+}}{\partial A_{ab}},$

where:

$\begin{matrix} {\frac{\partial A_{ij}^{+}}{\partial A_{ab}} = {{\left( {A^{+}\left( A^{+} \right)^{T}} \right)_{ib}\delta_{ja}} - {A_{ia}^{+}A_{bj}^{+}} - {\left( {A^{+}\left( A^{+} \right)^{T}} \right)_{ib}\left( {\left( A^{+} \right)^{T}A^{T}} \right)_{aj}}}} & \left( {{Eq}.\mspace{14mu} 11} \right) \end{matrix}$

Applying the derivative in equation 11 to obtain the derivatives

${\frac{\partial m}{\partial A}\mspace{14mu} {and}\mspace{14mu} \frac{\partial m}{\partial B}},$

as well as the second derivatives of the quantity m, results in expressions that are slightly longer than those presented above in equations 4-8, but similar in nature.
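As an illustrative check only, the expression of equation 11 may be compared numerically against a central finite difference of the Moore-Penrose pseudo-inverse. The sketch below assumes that A has full column rank (more set top boxes than profiles, with linearly independent columns); for a rank-deficient A, an additional term may be required. The function name `pinv_derivative`, the example matrix, and the step `eps` are hypothetical.

```python
import numpy as np

def pinv_derivative(A, a, b):
    """Evaluate Eq. 11: the derivative of pinv(A) with respect to A[a, b]."""
    P = np.linalg.pinv(A)             # A+, shape (k, n) when A has shape (n, k)
    G = P @ P.T                       # A+ (A+)^T, shape (k, k)
    S = P.T @ A.T                     # (A+)^T A^T, shape (n, n)
    delta_a = np.eye(A.shape[0])[a]   # delta_ja as a vector over j
    return (np.outer(G[:, b], delta_a)
            - np.outer(P[:, a], P[b, :])
            - np.outer(G[:, b], S[a, :]))

# Hypothetical A with full column rank (4 set top boxes, 2 profiles).
A = np.array([[1.0, 0.2], [0.1, 0.9], [0.8, 0.7], [0.3, 0.4]])
a, b, eps = 2, 1, 1e-6

# Central finite difference of pinv(A) with respect to the entry A[a, b].
E = np.zeros_like(A)
E[a, b] = eps
numeric = (np.linalg.pinv(A + E) - np.linalg.pinv(A - E)) / (2 * eps)
analytic = pinv_derivative(A, a, b)
max_gap = np.max(np.abs(numeric - analytic))
```

A small value of `max_gap` indicates that the analytic expression and the numerical derivative agree for this example.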

Moreover, instead of using convex minimization routines, we may use various nonlinear minimizations with slightly altered constraints to minimize the squared norms of the differences above.

An initial guess, for example, but not limited to, a random guess, is given to the algorithm for any of the probabilistic quantities in A and B. Additional constraints may be given to the algorithm to increase its accuracy. Of course, other optimization (or learning) algorithms may be used. The output is a set of probabilities, A, associating groups of profiles with the set top boxes, which may later be quantized and/or resolved (using, when needed, a profile resolving procedure and quantization), and a set of probabilities, B, providing the targeted rating for each (for example) program and each profile (also to be used in the profile resolving scheme when needed). It should be noted that the targeted rating may be re-calculated during the post-processing to increase the accuracy.

It should be noted that the abovementioned examples, equations, and functionalities are based upon the general premise that matrix C can be approximated by matrix A multiplied by matrix B. Of course, further examples for achieving such approximation may be provided and such examples are intended to be included within the present invention.

Quantization

The quantization step is typically, but not necessarily, used after the learning and profile determination stage, in the identification functionality, or several times during the steps of learning, profile determination, and identification.

One approach to finding the quantizing constants (a set of constants that each of the probabilities relating each of the found profiles to set top boxes should be divided by to determine whether a certain profile should indeed be associated with a certain set top box or not) is to assume that A is approximately a binary matrix with a constant multiplicative factor per column, s_(i) (1≦i≦number of profile groups), or in other words, assume that each of the i profile groups has its own quantization constant. Since the entries are supposed to be binary quantities, one expects the following from calculating the mean and variance using the binomial distribution, as shown by equations 12 and 13.

Σ_(a) A _(ai) =s _(i) Np   (Eq. 12)

Σ_(a) A _(ai) ² /N−(Σ_(a) A _(ai))² /N ² =s _(i) ² pq   (Eq. 13)

where N is the number of set top boxes in the network, p is the probability that a profile is associated to a set top box, and q=1−p. Solving equation 12 and equation 13 for s_(i), dividing A_(ai)/s_(i) and rounding to a pre-defined threshold, leads to an association rule, associating each of the profiles (resolved or yet unresolved) to each of the set top boxes.
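The solution of equations 12 and 13 can be written in closed form. Reading equation 12 as the binomial mean, so that the column mean M_(i) of A equals s_(i)p, and taking the left side of equation 13 as the column variance V_(i)=s_(i)²pq, dividing gives V_(i)/M_(i)=s_(i)(1−p)=s_(i)−M_(i), hence s_(i)=M_(i)+V_(i)/M_(i) and p=M_(i)/s_(i). The following sketch applies this under stated assumptions: the function name `quantize`, the threshold of 0.5, and the example matrix are hypothetical.

```python
import numpy as np

def quantize(A, threshold=0.5):
    """Estimate s_i and p per profile from Eqs. 12 and 13, then binarize A."""
    mean = A.mean(axis=0)     # Eq. 12 divided by N, i.e. s_i * p
    var = A.var(axis=0)       # left side of Eq. 13, i.e. s_i**2 * p * q
    # Assumes every profile column has a nonzero mean.
    s = mean + var / mean     # because var / mean = s_i * (1 - p)
    p = mean / s
    associated = (A / s >= threshold).astype(int)
    return associated, s, p

# Hypothetical A: binary associations scaled by 0.8 and 0.5 per profile.
A = np.array([[0.8, 0.5],
              [0.0, 0.5],
              [0.8, 0.5],
              [0.0, 0.0]])
associated, s, p = quantize(A)
```

For this noise-free example the estimated constants recover the scale factors 0.8 and 0.5, and dividing by them restores the underlying binary associations.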

Profile Determination

Profile determination, or resolving, is a process that defines the nature of identified profiles. During profile resolving, profiles definition, for example from a single source research results, such as, but not limited to, viewing habits and behavior, may be used as inputs. In addition, the profile list and targeted rating of defined profiles may be used as inputs. The inputs are provided to a resolving algorithm resulting in profile descriptions that describe each profile in the list.

The single source research addresses a focus group that answers a questionnaire. There are two groups of questions in this questionnaire, namely, a first group and a second group. The first group refers to the identity of a person, examples including behavior (e.g., purchasing behavior, rest and relaxation preferences, etc.) and the demographic profile of the answering person. The second group refers to media consumption, for example, the times at which a person would watch television each day of the week and his or her preferred shows.

The single source research associates the media consumption habits with other habits, such as, but not limited to, purchasing habits and preferred vacation habits. The output of the single source research is a set of profiles and their habits, while each profile is associated with its media consumption habits. The resolving algorithm finds the best correlation between two sets of data, namely, for example, the media consumption habits of the focus group; and, for example, the targeted rating of the defined profiles (the output of the unsupervised learning algorithm). Therefore, the resolving algorithm has the capability of defining the traits of the learned profile in the unsupervised algorithm.
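One simple way to realize such a correlation-based resolving step is sketched below. It labels each learned profile with the reference profile whose media consumption habits correlate best with the learned targeted rating. This is an assumption-laden stand-in rather than the resolving algorithm itself: the function name `resolve_profiles`, the profile names, and all numeric values are hypothetical, and the sketch does not enforce a one-to-one match between learned and reference profiles.

```python
import numpy as np

def resolve_profiles(learned_rating, reference_habits, names):
    """Label each learned profile with the best-correlated reference profile."""
    k = learned_rating.shape[0]
    # np.corrcoef treats each row as a variable and stacks both inputs;
    # the top-right block holds the learned-versus-reference correlations.
    corr = np.corrcoef(learned_rating, reference_habits)[:k, k:]
    best = corr.argmax(axis=1)
    return [names[j] for j in best], corr

# Hypothetical targeted rating B: 2 learned profiles by 4 programs.
learned_rating = np.array([[0.40, 0.10, 0.05, 0.30],
                           [0.05, 0.35, 0.40, 0.10]])
# Hypothetical focus-group viewing habits for two described profiles.
names = ["sports fans", "news watchers"]
reference_habits = np.array([[0.1, 0.6, 0.7, 0.2],
                             [0.8, 0.2, 0.1, 0.5]])

labels, corr = resolve_profiles(learned_rating, reference_habits, names)
```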

In accordance with the present invention, after the learning and identification are performed, the management application 50 knows online, or offline, the current psychographic or demographic profiles that are consuming content for at least a portion of the set top boxes of the network for which the zapping log contains records of set top box zapping signatures. The information regarding the current demographic/psychographic profiles that are consuming content for set top boxes within the network for which sufficient input was received, may be the basis for personalized advertisements deployment in accordance with the present invention.

It should be emphasized that the above-described embodiments of the present invention are merely possible examples of implementations, merely set forth for a clear understanding of the principles of the invention. Many variations and modifications may be made to the above-described embodiments of the invention without departing substantially from the spirit and principles of the invention. All such modifications and variations are intended to be included herein within the scope of this disclosure and the present invention and protected by the following claims. 

1. A method of performing unsupervised learning to associate viewer profiles in video audiences of a network, comprising the steps of: receiving a zapping log and a broadcast schedule, wherein the zapping log includes records of set top box zapping signatures for at least a portion of the set top boxes of the network; deriving set top box signatures from the zapping log and broadcast schedule; clustering viewer profiles into groups of viewer profiles using the set top box signatures; and associating at least one set top box within the network with at least one viewer profile, wherein the method of performing unsupervised learning does not use data associating demographic or psychographic profiles to the at least a portion of the set top boxes of the network for which the zapping log contains records of set top box zapping signatures.
 2. The method of claim 1, wherein said zapping signatures include events where there is a switching from a current service to another service and/or other means for communicating with the set top box.
 3. The method of claim 1, further comprising the step of determining a targeted rating of a content per profile and a regional targeted rating.
 4. The method of claim 1, further comprising the step of determining a total viewership.
 5. The method of claim 1, further comprising the step of determining a viewership flow per a pre-defined time step.
 6. The method of claim 1, further comprising the step of determining a content to profile assignment.
 7. The method of claim 1, wherein the set top box signatures comprise at least one signature for each set top box in the network.
 8. The method of claim 7, wherein the set top box signatures may include at least one class of signature selected from the group consisting of viewing signatures, time signatures, high-resolution time signatures, and zapping frequency signatures.
 9. The method of claim 1, wherein the groups of viewer profiles resulting from the step of clustering are unresolved viewer profiles, where a type of the viewer profile is not known.
 10. The method of claim 9, wherein an optimization algorithm is used to perform the step of clustering.
 11. The method of claim 9, further comprising the step of matching the unresolved viewer profiles and predefined profiles, to provide resolved viewer profiles.
 12. The method of claim 11, wherein the step of matching further comprises performing a best match procedure on both the unresolved viewer profiles and the predefined profiles.
 13. The method of claim 11, wherein the step of associating at least one set top box within the network with at least one viewer profile uses the resolved viewer profiles with the set top box signatures as input.
 14. The method of claim 1, wherein the step of associating at least one set top box within the network with at least one viewer profile uses a quantization process.
 15. The method of claim 1, wherein the step of associating at least one set top box within the network with at least one viewer profile further comprises providing a relationship between a matrix A, a matrix B, and a matrix C, where matrix A multiplied by matrix B approximates matrix C, where A represents a set of parameters representing the association of at least one profile to set top boxes, B represents an aggregation of targeted rating probabilities of each of the viewer profiles per each content watched by at least a portion of the set top boxes of the network for which the zapping log contains records of set top box zapping signatures, and C is an aggregation of the set top box signatures.
 16. The method of claim 14, wherein the quantization process further comprises calculating a mean and variance.
 17. A system for providing unsupervised learning to associate consumer profiles in video audiences, wherein the system comprises a head end having a computer and means for communicating therein, wherein the computer has a management application stored therein, and wherein the management application further comprises: logic configured to receive a zapping log and a broadcast schedule, wherein the zapping log includes records of set top box zapping signatures for at least a portion of the set top boxes of the network; logic configured to derive set top box signatures from the zapping log and broadcast schedule; logic configured to cluster viewer profiles into groups of viewer profiles using the set top box signatures; and logic configured to associate at least one set top box within the network with at least one viewer profile, wherein performing unsupervised learning does not use data associating demographic or psychographic profiles to the at least a portion of the set top boxes of the network for which the zapping log contains records of set top box zapping signatures.
 18. The system of claim 17, wherein said zapping signatures include events where there is a switching from a current service to another service and/or other means for communicating with the set top box.
 19. The system of claim 17, wherein the management application further comprises logic configured to determine a targeted rating of a content per profile and a regional targeted rating.
 20. The system of claim 17, wherein the management application further comprises logic configured to determine a total viewership.
 21. The system of claim 17, wherein the management application further comprises logic configured to determine a viewership flow per a pre-defined time step.
 22. The system of claim 17, wherein the management application further comprises logic configured to determine a content to profile assignment.
 23. The system of claim 17, wherein the set top box signatures comprise at least one signature for each set top box in the network.
 24. The system of claim 23, wherein the set top box signatures include at least one class of signature selected from the group consisting of viewing signatures, time signatures, high-resolution time signatures, and zapping frequency signatures.
 25. The system of claim 17, wherein the groups of viewer profiles resulting from clustering are unresolved viewer profiles, where a type of the viewer profile is not known.
 26. The system of claim 25, wherein an optimization algorithm is used to perform clustering.
 27. The system of claim 25, wherein the management application further comprises logic configured to match the unresolved viewer profiles and predefined profiles, to provide resolved viewer profiles.
 28. The system of claim 27, wherein matching further comprises performing a best match procedure on both the unresolved viewer profiles and the predefined profiles.
 29. The system of claim 27, wherein associating at least one set top box within the network with at least one viewer profile uses the resolved viewer profiles with the set top box signatures as input.
 30. The system of claim 29, wherein associating at least one set top box within the network with at least one viewer profile uses a quantization process.
 31. The system of claim 17, wherein associating at least one set top box within the network with at least one viewer profile further comprises providing a relationship between a matrix A, a matrix B, and a matrix C, where matrix A multiplied by matrix B approximates matrix C, where A represents a set of parameters representing the association of at least one profile to set top boxes, B represents an aggregation of targeted rating probabilities of each of the viewer profiles per each content watched by at least a portion of the set top boxes of the network for which the zapping log contains records of set top box zapping signatures, and C is an aggregation of the set top box signatures. 